Scale Alignment In Multidimensional Rasch Family Models

Leah Feuerstahler

November 3, 2025

Context

  • Did this student score better on Verbal Reasoning or Quantitative Reasoning? What about analytic writing?

Context

Even though social development is qualitatively different from cognition, it is natural to compare these scores.

Context

  • Data are scored using Rasch family models

  • Between-items multidimensionality

    • Each item belongs to exactly one dimension
  • Dimensions “hang together”

    • Unifying higher-order dimension

    • Reference dimension (Ackerman, 1992)

    • Weighted average of dimensions

Unidimensional Rasch Model

\[ \text{logit}(P_{ij}) = \theta_j - \delta_i \]

  • \(P_{ij}\): probability of a keyed response for person \(j\), item \(i\)

  • \(\theta_j\): person \(j\)’s location (ability)

  • \(\delta_i\): item \(i\)’s location (difficulty)

\[\text{logit}(P_{ij}) = \alpha(\theta_j - \delta_i)\]

  • \(\alpha\): steepness parameter often (but not necessarily) equal to 1

Multidimensional Rasch Model

\[\text{logit}(P_{i(d)j}) = \alpha_d(\theta_{dj} - \delta_{i(d)})\]

  • \(i(d)\): item \(i\) belonging to dimension \(d\)

  • \(\alpha_d\): steepness parameter for dimension \(d\); may differ across dimensions

Under the unidimensional Rasch model, items can be uniquely ordered

  • \(\delta_i\) is ordered according to the item’s difficulty

  • \(\hat{p}_i\) (the proportion of correct responses to item \(i\)) is a sufficient statistic for \(\delta_i\)

  • \(\hat{p}_i\) and \(\hat{\delta}_i\) are related by a unique bijective function

Identifying Multidimensional Rasch Models

| | ConQuest | TAM | 1PL |
|---|---|---|---|
| Latent Means | Estimate \(\mu_1\), \(\mu_2\) | Set \(\mu_1 = \mu_2 = 0\) | Set \(\mu_1 = \mu_2 = 0\) |
| Item Difficulties | Set \(\sum\hat{\delta}_{i(d)}=0\) | Estimate \(\delta_{i(d)}\) | Estimate \(\delta_{i(d)}\) |
| Latent Variances | Estimate \(\phi_{11}\), \(\phi_{22}\) | Estimate \(\phi_{11}\), \(\phi_{22}\) | Set \(\phi_{11} = \phi_{22} = 1\) |
| Item Steepnesses | Set \(\alpha_{d}=1\) | Set \(\alpha_{d}=1\) | Estimate \(\alpha_d\) |

Example Dataset

1000 responses from Australian students to 12 multiple-choice TIMSS items

  • These data are described in chapters 3, 9 of the ConQuest manual

  • 6 items measure math ability

  • 6 items measure science ability

Wu, M., Adams, R., Wilson, M., & Haldane, S. (2007). ACER ConQuest: Generalised item response modelling software (Version 2.0) [computer software]. Melbourne, Australia: ACER.

Identifying Multidimensional Rasch Models

Transform \(\alpha_d\), \(\theta_{dj}\), \(\delta_{i(d)}\) to \(\tilde{\alpha}_d\), \(\tilde{\theta}_{dj}\), \(\tilde{\delta}_{i(d)}\)

  • Scaling constant \(r_d\): changes the slope of dimensions

  • Shift constant \(s_d\): changes the location of dimensions

\[\tilde{\alpha}_d = \frac{\alpha_d}{r_d}\]

\[\tilde{\theta}_{dj} = r_d \theta_{dj} + s_d\]

\[\tilde{\delta}_{i(d)} = r_d \delta_{i(d)} + s_d\]

Original latent mean and variance: \(\mu_d\), \(\sigma^2_d\)

Transformed latent mean and variance: \(r_d \mu_d + s_d\), \(r^2_d \sigma^2_d\)
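As a quick check on why this transformation is admissible, substituting the transformed parameters into the model equation shows that the constants cancel (assuming \(r_d > 0\)), so the response probabilities are unchanged:

\[\tilde{\alpha}_d(\tilde{\theta}_{dj} - \tilde{\delta}_{i(d)}) = \frac{\alpha_d}{r_d}\left[(r_d\theta_{dj} + s_d) - (r_d\delta_{i(d)} + s_d)\right] = \alpha_d(\theta_{dj} - \delta_{i(d)})\]

Any choice of \(r_d\) and \(s_d\) therefore gives an equally well-fitting model; alignment is a matter of choosing among these equivalent scales.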

Evaluating Alignment

As \(\hat{p}_{i(d)}\) increases, the corresponding \(\delta_{i(d)}\) decreases

Theoretical definition:

Scales are aligned if the same sufficient statistic implies the same parameter estimate, regardless of dimension

Practical evaluation:

Scales are aligned if the absolute rank-order correlation between \(\hat{p}_{i(d)}\) and \(\hat{\delta}_{i(d)}\) (combining dimensions) equals 1
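A minimal sketch of the practical evaluation, using made-up values rather than data from the talk. The helper name `alignment_tau` and the constants \(r_2 = 2\), \(s_2 = 0\) are hypothetical and chosen only to show how a linear rescaling of one dimension can change the combined rank-order correlation.

```r
# Made-up proportions correct and difficulty estimates for 3 items per dimension
p_hat     <- c(0.80, 0.65, 0.50, 0.72, 0.55, 0.30)
delta_hat <- c(-1.4, -0.6, 0.0, -0.5, -0.1, 0.4)
dim_id    <- c(1, 1, 1, 2, 2, 2)

# Absolute Kendall rank-order correlation between p-hat and delta-hat
alignment_tau <- function(p, delta) abs(cor(p, delta, method = "kendall"))

# Within each dimension the ordering is perfect ...
tau_within <- sapply(split(seq_along(p_hat), dim_id),
                     function(i) alignment_tau(p_hat[i], delta_hat[i]))

# ... but combining dimensions reveals a misordering across dimensions
tau_before <- alignment_tau(p_hat, delta_hat)

# Hypothetical transformation of dimension 2 only: delta-tilde = r * delta + s
r2 <- 2
s2 <- 0
delta_tilde <- ifelse(dim_id == 2, r2 * delta_hat + s2, delta_hat)
tau_after <- alignment_tau(p_hat, delta_tilde)

print(c(before = tau_before, after = tau_after))
```

In this toy example the combined correlation is below 1 before the transformation and equals 1 afterwards, even though the within-dimension orderings never change.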

Analogy: factor rotation in exploratory factor analysis

  • In practice, the optimal alignment transformation may not be unique

How to Achieve Alignment

Reference Dimension Approach

  • Find item ordering by fitting all items to a unidimensional model

  • Delta dimensional alignment (DDA)

  • Originally described by Schwartz & Ayers (2011), others

Functional Approach

  • Estimate the functional relationship between \(\hat{p}_{i(d)}\) and \(\hat{\delta}_{i(d)}\)

  • Logistic regression alignment (LRA)

Delta Dimensional Alignment (DDA)

  1. Fit data to a unidimensional model \(\mathcal{U}\): \(\hat{\boldsymbol{\delta}}_{\mathcal{U}d}\)

  2. Fit data to a between-items multidimensional model \(\mathcal{M}\): \(\hat{\boldsymbol{\delta}}_{\mathcal{M}d}\)

  3. For each dimension \(d\), find the transformation parameters \(\hat{r}_d\) and \(\hat{s}_d\) such that \(\hat{\boldsymbol{\delta}}_{\mathcal{U}d}\) and \(\hat{\boldsymbol{\delta}}_{\mathcal{M}d}\) have the same mean (\(\text{mn}\)) and standard deviation (\(\text{sd}\))

\[\begin{align*} \hat{r}_d & = \frac{\text{sd}(\hat{\boldsymbol{\delta}}_{\mathcal{U}d})}{\text{sd}(\hat{\boldsymbol{\delta}}_{\mathcal{M}d})} \\ \hat{s}_d & = \text{mn}(\hat{\boldsymbol{\delta}}_{\mathcal{U}d}) - \hat{r}_d\text{mn}(\hat{\boldsymbol{\delta}}_{\mathcal{M}d}) \end{align*}\]

  4. To leave dimension 1 unchanged (\(\hat{r}_1 = 1\) and \(\hat{s}_1 = 0\)), use instead:

\[\begin{align*} \hat{r}_d & = \frac{\text{sd}(\hat{\boldsymbol{\delta}}_{\mathcal{U}d})\text{sd}(\hat{\boldsymbol{\delta}}_{\mathcal{M}1})}{\text{sd}(\hat{\boldsymbol{\delta}}_{\mathcal{M}d})\text{sd}(\hat{\boldsymbol{\delta}}_{\mathcal{U}1})} \\ \hat{s}_d & = \frac{\text{mn}(\hat{\boldsymbol{\delta}}_{\mathcal{U}d}) - \text{mn}(\hat{\boldsymbol{\delta}}_{\mathcal{U}1}) + \frac{\text{sd}(\hat{\boldsymbol{\delta}}_{\mathcal{U}1})}{\text{sd}(\hat{\boldsymbol{\delta}}_{\mathcal{M}1})}\text{mn}(\hat{\boldsymbol{\delta}}_{\mathcal{M}1}) - \frac{\text{sd}(\hat{\boldsymbol{\delta}}_{\mathcal{U}d})}{\text{sd}(\hat{\boldsymbol{\delta}}_{\mathcal{M}d})}\text{mn}(\hat{\boldsymbol{\delta}}_{\mathcal{M}d})}{\text{sd}(\hat{\boldsymbol{\delta}}_{\mathcal{U}1}) / \text{sd}(\hat{\boldsymbol{\delta}}_{\mathcal{M}1})} \end{align*}\]
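The DDA steps above can be sketched in base R as follows. This is an illustration under stated assumptions, not the scaleAlign implementation: the function name `dda_constants` is hypothetical, and the difficulty vectors are made-up numbers standing in for estimates from the fitted models \(\mathcal{U}\) and \(\mathcal{M}\).

```r
# delta_U, delta_M: item difficulties from the unidimensional and
# multidimensional fits, in the same item order; dim_id: dimension of each item
dda_constants <- function(delta_U, delta_M, dim_id, ref = 1) {
  dims <- sort(unique(dim_id))
  # Match the sd and mean of M to those of U within each dimension
  r_raw <- sapply(dims, function(d) sd(delta_U[dim_id == d]) / sd(delta_M[dim_id == d]))
  s_raw <- sapply(seq_along(dims), function(k) {
    d <- dims[k]
    mean(delta_U[dim_id == d]) - r_raw[k] * mean(delta_M[dim_id == d])
  })
  # Undo the reference dimension's transformation so that it stays unchanged
  k_ref <- match(ref, dims)
  list(r = setNames(r_raw / r_raw[k_ref], dims),
       s = setNames((s_raw - s_raw[k_ref]) / r_raw[k_ref], dims))
}

# Made-up difficulties for 3 items per dimension
delta_U <- c(-1.0, 0.0, 1.0, -0.5, 0.5, 1.5)
delta_M <- c(-1.2, 0.0, 1.2, -0.4, 0.1, 0.6)
dim_id  <- c(1, 1, 1, 2, 2, 2)

out <- dda_constants(delta_U, delta_M, dim_id)
print(out)
```

Dividing by the reference dimension's constants reproduces the closed-form expressions for \(\hat{r}_d\) and \(\hat{s}_d\) that leave dimension 1 unchanged.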

Logistic Regression Alignment (LRA)

  1. Fit the following model for each dimension:

\[\text{logit}(P(x_{ij} = 1 \vert \hat{\delta}_{i(d)})) = \hat{\gamma}_{0d} + \hat{\gamma}_{1d}\hat{\delta}_{i(d)}\]

  2. Set \(\hat{r}_1 = 1\) and \(\hat{s}_1 = 0\). For the remaining dimensions:

\[\begin{align*} \hat{r}_d & = \frac{\hat{\gamma}_{1d}}{\hat{\gamma}_{11}} \\ \hat{s}_d & = \frac{\hat{\gamma}_{0d} - \hat{\gamma}_{01}}{\hat{\gamma}_{11}} \end{align*}\]
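A rough base-R sketch of the two LRA steps, under loud assumptions: the data are simulated, the generating difficulties stand in for the estimates \(\hat{\delta}_{i(d)}\) that would normally come from a fitted multidimensional model, and all object names are hypothetical. It is meant to show the mechanics, not to reproduce the scaleAlign code.

```r
set.seed(1)
n_person <- 500
n_item   <- 5

# Made-up difficulties and steepnesses; dimension 2 is on a different scale
delta <- list(`1` = seq(-1, 1, length.out = n_item),
              `2` = seq(-0.5, 0.5, length.out = n_item))
alpha <- c(`1` = 1, `2` = 2)
theta <- list(`1` = rnorm(n_person), `2` = rnorm(n_person))

# Step 1: per dimension, regress the 0/1 responses on the item difficulties
fit_dim <- function(d) {
  eta <- alpha[[d]] * outer(theta[[d]], delta[[d]], "-")   # persons x items
  x   <- matrix(rbinom(length(eta), 1, plogis(eta)), nrow = n_person)
  long <- data.frame(x = as.vector(x),                     # stacked by item
                     delta = rep(delta[[d]], each = n_person))
  coef(glm(x ~ delta, family = binomial, data = long))
}
gamma <- sapply(c("1", "2"), fit_dim)   # rows: (Intercept), delta

# Step 2: express every dimension relative to dimension 1
r_hat <- gamma["delta", ] / gamma["delta", "1"]
s_hat <- (gamma["(Intercept)", ] - gamma["(Intercept)", "1"]) / gamma["delta", "1"]

print(rbind(r_hat, s_hat))
```

The formulas follow from requiring \(\hat{\gamma}_{0d} + \hat{\gamma}_{1d}\hat{\delta}_{i(d)} = \hat{\gamma}_{01} + \hat{\gamma}_{11}(\hat{r}_d\hat{\delta}_{i(d)} + \hat{s}_d)\), so that after transformation every dimension shares dimension 1's regression line.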

Simulation 1

Goals:

  • What factors affect alignment accuracy?

  • Any performance differences between DDA and LRA?

Factors manipulated:

  • Sample size: {200, 500, 1000}

  • Number of items per dimension: {5, 20}

  • Mean of \(\theta_2\): {0, .5, 1} (\(\theta_1\) was standard normal)

  • SD of \(\theta_2\): {.5, 1, 1.5, 2}

  • Skewness of \(\theta_2\): {0, .75}

  • Kurtosis of \(\theta_2\): {0, 3}

  • Correlation between \(\theta_1\) and \(\theta_2\): {0, .3, .6, .9}

  • 20 replications

Simulation 1 Results

Does alignment work?

| | Before Alignment | After DDA | After LRA |
|---|---|---|---|
| Median \(\tau(\hat{\boldsymbol{p}},\hat{\boldsymbol{\delta}})\) | \(.987\) | \(1.000\) | \(.997\) |
| \(\%\tau(\hat{\boldsymbol{p}},\hat{\boldsymbol{\delta}})=1\) | \(23\%\) | \(55\%\) | \(55\%\) |

  • \(\tau(\hat{\boldsymbol{p}},\hat{\boldsymbol{\delta}})\): absolute value of Kendall’s rank-order correlation between \(\hat{\boldsymbol{p}}\) and \(\hat{\boldsymbol{\delta}}\)

Does DDA or LRA work better?

  • No systematic differences found

Simulation 1 Results

When is alignment most useful?

Criterion: \(\tau(\hat{\boldsymbol{p}},\hat{\boldsymbol{\delta}})\) before alignment

Predictors: All manipulated factors + interactions

Multiple \(R^2 =.38\)

| Source | df | \(F\) | \(\eta^{2}\) |
|---|---|---|---|
| \(\text{sd}(\theta_{2})\) | 3 | 8483 | \(.272\) |
| \(\text{kurt}(\theta_{2})\) | 1 | 698 | \(.007\) |
| \(n_{1}\times n_{2}\) | 1 | 3397 | \(.036\) |
| \(\text{sd}(\theta_{2})\times\text{kurt}(\theta_{2})\) | 3 | 300 | \(.009\) |
| \(n_{1}\times n_{2}\times\text{sd}(\theta_{2})\) | 3 | 471 | \(.015\) |

  • \(\tau(\hat{\boldsymbol{p}},\hat{\boldsymbol{\delta}})\) higher when \(\text{sd}(\theta_{2})\) closer to 1

  • \(\tau(\hat{\boldsymbol{p}},\hat{\boldsymbol{\delta}})\) slightly higher with higher \(\text{kurt}(\theta_{2})\)

  • More likely to have perfect alignment initially if \(n_{1}=n_{2}=5\)

Simulation 2

What is the relationship between \(\hat{r}_2\), \(\hat{s}_2\) and the latent trait distributions?

  • \(\theta_1 \sim N(0, 1)\)

  • mean \(\theta_2 \in \{0, .5, 1, 1.5, 2\}\)

  • standard deviation \(\theta_2 \in \{.5, 1, 1.5, 2\}\)

  • skewness \(\theta_2 \in \{0, .5, 1, 1.5, 2\}\)

  • kurtosis \(\theta_2 \in \{-2, -1, 0, 1, 2, 3\}\)

Simulation 2 Results

Polytomous Models

Between-items multidimensional partial credit model (PCM):

\[\ln\left[\frac{P_{i(d)j}(x)}{P_{i(d)j}(0)}\right] = \sum_{k=0}^x \alpha_d (\theta_{dj} - \xi_{i(d)k})\]

  • \(\xi_{i(d)k}\) is an item step parameter

where

\[P_{i(d)j}(0) = \frac{1}{\sum_{l=0}^{m_{i(d)}}\exp \sum_{k=0}^l\alpha_d(\theta_{dj} - \xi_{i(d)k})}\]

  • \(m_{i(d)}\): maximum score for item \(i\) on dimension \(d\) (one less than the number of response categories)

\[\exp \sum_{k=0}^0 \alpha_d(\theta_{dj} - \xi_{i(d)k}) = 1\]

Polytomous Models

Another way to express the between-items multidimensional PCM:

\[\ln\left[\frac{P_{i(d)j}(x)}{P_{i(d)j}(0)}\right] = \sum_{k=0}^x \alpha_d (\theta_{dj} - \delta_{i(d)} - \tau_{i(d)k})\]

  • \(\delta_{i(d)}\): average item step parameter

  • \(\tau_{i(d)k}\): step deviation

  • \(\xi_{i(d)k} = \delta_{i(d)} + \tau_{i(d)k}\)

Polytomous Models

Problem: \(\xi_{i(d)k}\) parameters are not necessarily ordered

  • They indicate the locations at which adjacent category response curves intersect

Alternative: Thurstone thresholds \(\lambda_{i(d)k}\)

  • Indicate location at which the probability of responding in category \(k\) or higher = .5
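To make the contrast concrete, here is a small base-R sketch that computes Thurstone thresholds numerically from PCM step parameters. The function names `pcm_probs` and `thurstone` and the step values are made up for illustration, and the root-finding interval of \([-20, 20]\) is an assumption that suits parameters of ordinary magnitude.

```r
# PCM category probabilities for one item at ability theta.
# xi holds the step parameters for k = 1, ..., m; the k = 0 term is fixed at 0.
pcm_probs <- function(theta, xi, alpha = 1) {
  num <- exp(c(0, cumsum(alpha * (theta - xi))))   # categories 0, ..., m
  num / sum(num)
}

# Thurstone threshold k: the theta at which P(X >= k) = .5
thurstone <- function(xi, alpha = 1) {
  m <- length(xi)
  sapply(seq_len(m), function(k) {
    f <- function(theta) sum(pcm_probs(theta, xi, alpha)[(k + 1):(m + 1)]) - 0.5
    uniroot(f, interval = c(-20, 20), tol = 1e-10)$root
  })
}

xi     <- c(0.5, -0.5)   # made-up, deliberately disordered step parameters
lambda <- thurstone(xi)
print(lambda)            # thresholds come out ordered despite disordered steps
```

Because \(P(X \ge k)\) is increasing in \(\theta\) and decreasing in \(k\), the thresholds are ordered within an item even when the step parameters are not.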

Transforming Polytomous Parameters

\[\begin{align*} \tilde{\theta}_{d} & =r_{d}\theta_{d}+s_{d}\\ \tilde{\alpha}_{d} & =\alpha_{d}/r_{d}\\ \tilde{\delta}_{i(d)} & =r_{d}\delta_{i(d)}+s_{d}\\ \tilde{\tau}_{i(d)k} & =r_{d}\tau_{i(d)k}\\ \tilde{\xi}_{i(d)k} & =r_{d}\xi_{i(d)k}+s_{d} \end{align*}\]
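The Thurstone thresholds are not listed among the transformed quantities, but their behavior follows from the same cancellation. As a short worked step (a derivation from the equations here, not a result quoted from the talk):

\[\tilde{\alpha}_{d}(\tilde{\theta}_{d} - \tilde{\xi}_{i(d)k}) = \frac{\alpha_{d}}{r_{d}}\left[(r_{d}\theta_{d} + s_{d}) - (r_{d}\xi_{i(d)k} + s_{d})\right] = \alpha_{d}(\theta_{d} - \xi_{i(d)k})\]

Every category probability at \(\tilde{\theta}_{d}\) under the transformed parameters therefore equals the corresponding probability at \(\theta_{d}\) under the original parameters, so the point at which the probability of responding in category \(k\) or higher equals .5 moves in the same way as \(\theta_d\):

\[\tilde{\lambda}_{i(d)k} = r_{d}\lambda_{i(d)k} + s_{d}\]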

How to Align Polytomous Parameters?

Before, with dichotomous items:

  • Look at the relationship between \(\hat{\boldsymbol{p}}\) and \(\hat{\boldsymbol{\delta}}\)

Several quantities serve the role of the \(\delta_{i(d)}\) parameters

  • \(\xi_{i(d)k}\) are the quantities compared to \(\theta_d\) in the model equation

  • \(\delta_{i(d)}\) is an overall item location

  • \(\lambda_{i(d)k}\) indicates the location at which there is a .5 probability of a response in category \(k\) or higher

The sufficient statistics for the PCM, \(\hat{p}_{i(d)k}\), are the proportions of responses that reach at least score category \(k\), for \(k = 1, \ldots, m_{i(d)}\)

  • Within items, \(\xi_{i(d)k}\) not necessarily monotonically related to \(\hat{p}_{i(d)k}\)

  • Within items, \(\lambda_{i(d)k}\) necessarily monotonically related to \(\hat{p}_{i(d)k}\)

  • Working definition: dimensions are aligned if the same \(\hat{p}_{i(d)k}\) implies the same \(\lambda_{i(d)k}\) regardless of dimension

Polytomous DDA

DDA method:

  • Fit data to both unidimensional and multidimensional models

  • Find a transformation so that, for each dimension, the mean and standard deviation of the multidimensional parameters equal the mean and standard deviation of the unidimensional parameters

  • What parameter to use? All of these have similarities to the \(\delta_{i(d)}\) of binary models.

    • \(\xi_{i(d)k}\)

    • \(\delta_{i(d)}\)

    • \(\lambda_{i(d)k}\)

Remember: DDA relies on a strong within-dimension relationship between \(\mathcal{M}\) and \(\mathcal{U}\)

Polytomous DDA

  • \(\delta\) and \(\lambda\) may be appropriate, \(\xi\) probably is not

Polytomous DDA and LRA

Follow mostly the same procedures as before

DDA:

  • Use either \(\hat{\delta}_{i(d)}\) or \(\lambda_{i(d)k}\) in place of \(\delta_{i(d)}\)

LRA:

  • Use \(\lambda_{i(d)k}\) in place of \(\delta_{i(d)}\)

Evaluate alignment as the absolute rank-order correlation between \(\hat{\boldsymbol{p}}\) and \(\hat{\boldsymbol{\lambda}}\), across dimensions

Real Data Example

Kindergarten Individual Development Survey (KIDS)

  • Ratings taken on 59,429 kindergarten-age children in Illinois in spring 2015

  • 6 rating categories per measure (item)

Domains:

  • ATL-REG: Approaches to Learning and Self-Regulation - 4 measures

  • SED: Social and Emotional Development - 4 measures

  • LLD: Language and Literacy Development - 10 measures

  • COG: Math - 10 measures

  • PDH: Physical Development and Health - 9 measures

Real Data Example

Before alignment, \(\tau(\hat{\boldsymbol{p}}, \hat{\boldsymbol{\lambda}}) = .924\)

ATL-REG SED LLD COG PDH
| Method | \(\tau(\hat{\boldsymbol{p}}, \hat{\boldsymbol{\lambda}})\) after alignment | Coefficient | Values (in the order listed) |
|---|---|---|---|
| DDA \(\hat{\boldsymbol{\delta}}\) | \(.971\) | \(\hat{r}\) | \(1.000\), \(0.832\), \(0.973\), \(0.912\) |
| | | \(\hat{s}\) | \(0.000\), \(-0.005\), \(0.006\), \(0.009\) |
| DDA \(\hat{\boldsymbol{\lambda}}\) | \(.937\) | \(\hat{r}\) | \(1.000\), \(1.097\), \(1.071\), \(1.034\) |
| | | \(\hat{s}\) | \(0.000\), \(0.092\), \(0.098\), \(0.071\) |
| LRA | \(.934\) | \(\hat{r}\) | \(1.000\), \(1.095\), \(1.081\), \(1.050\) |
| | | \(\hat{s}\) | \(0.000\), \(0.072\), \(0.100\), \(0.129\) |

Simulation

  • N = 500 subjects

  • 5 or 20 items per dimension

  • Correlation between dimensions: {0, .5}

  • \(\text{sd}(\theta_2)\): {.5, 1, 2}

  • response categories per item: {3, 5, 7}

  • Percentage of missing data: {0%, 5%, 50%}

Note: can always perform all 3 methods and select which works “best”, i.e., leads to the highest absolute rank-order correlations between \(\hat{\boldsymbol{\lambda}}\) and \(\hat{\boldsymbol{p}}\)

Simulation Results

Major findings:

  • DDA \(\hat{\boldsymbol{\lambda}}\) and LRA reliably led to increases in rank-order correlations, DDA \(\hat{\boldsymbol{\delta}}\) sometimes led to decreases in rank-order correlations

  • DDA \(\hat{\boldsymbol{\delta}}\) was the preferred method most often when \(\text{sd}(\theta_2) = 1\)

  • With 3 response categories, DDA \(\hat{\boldsymbol{\lambda}}\) tended to be selected most often

  • For 5 or 7 response categories, LRA tended to be selected most often

  • Absolute rank-order correlations were generally higher for shorter tests and for smaller proportions of missing data

  • No large effects associated with other manipulated factors

Conclusions

Summary:

  • A concrete definition of aligned scales

  • Evidence that alignment largely accounts for differences in latent trait distributions

  • 2 methods to transform fitted models

  • Preliminary evidence that both methods are effective, at least in some datasets

  • Encouragement to test all methods, use the best results

Future directions:

  • Align while fitting the model, rather than a post-hoc transformation

  • Polytomous models with different numbers of response categories

Software Implementation

Load in packages:

## load packages
library(TAM)
library(sirt)
library(scaleAlign)

Read in example data and define Q matrix:

# load data
data(data.reck61DAT2) # from sirt package
dat <- data.reck61DAT2$data[, c(1:10, 21:30)]
Q <- sapply(list(c(1, 0), c(0, 1)), rep, each = 10)

Fit initial model:

mod <- tam.mml(resp = dat, Q = Q, verbose = FALSE)

Software Implementation

Align using DDA:

aligned_DDA <- align(mod, "DDA1")
aligned_DDA$cor_before
[1] 0.9789474
aligned_DDA$cor_after
[1] 1

Align using LRA:

aligned_LRA <- align(mod, "LRA")
aligned_LRA$cor_before # same as aligned_DDA$cor_before
[1] 0.9789474
aligned_LRA$cor_after
[1] 1

Software Implementation

View transformation coefficients:

aligned_DDA$rhat
       1        2 
1.000000 1.086073 
aligned_DDA$shat
          1           2 
 0.00000000 -0.00131837 
aligned_LRA$rhat
[1] 1.000000 1.107417
aligned_LRA$shat
[1]  0.00000000 -0.01236789

The aligned_DDA and aligned_LRA objects are TAM objects, so any function that can be used with TAM objects can be used with the aligned model.

Software Implementation

View transformed difficulties:

round(data.frame(before = mod$xsi$xsi, 
                 DDA = aligned_DDA$xsi$xsi, 
                 LRA = aligned_LRA$xsi$xsi), 2)
   before   DDA   LRA
1    0.20  0.20  0.20
2    0.46  0.46  0.46
3    0.71  0.71  0.71
4    0.80  0.80  0.80
5    0.68  0.68  0.68
6    0.77  0.77  0.77
7    1.23  1.23  1.23
8   -0.29 -0.29 -0.29
9    0.31  0.31  0.31
10   0.19  0.19  0.19
11   0.11  0.11  0.10
12   0.71  0.71  0.70
13   1.08  1.08  1.07
14   2.55  2.55  2.54
15   0.80  0.80  0.79
16   0.48  0.48  0.47
17   1.29  1.29  1.28
18   0.10  0.10  0.09
19   0.42  0.42  0.41
20   0.70  0.70  0.69
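One possible reading of why the three columns are nearly identical, assuming (this is not stated on the slide) that `xsi` is reported on the intercept scale \(\alpha_d\delta_{i(d)}\) rather than as \(\delta_{i(d)}\) itself: the scaling constant cancels and only a small shift remains.

\[\tilde{\alpha}_d\tilde{\delta}_{i(d)} = \frac{\alpha_d}{r_d}\left(r_d\delta_{i(d)} + s_d\right) = \alpha_d\delta_{i(d)} + \frac{\alpha_d s_d}{r_d}\]

Under that assumption and with \(\alpha_2 = 1\) in the initial model, the LRA coefficients shown earlier give a shift of \(-0.01236789 / 1.107417 \approx -0.011\) for items 11 to 20, which agrees with the drop of 0.01 in the LRA column. The DDA shift is about \(-0.001\), which disappears after rounding to two decimals.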

head(mod$person)
  pid case pweight score max   EAP.Dim1 SD.EAP.Dim1    EAP.Dim2 SD.EAP.Dim2
1   1    1       1     3  20 -1.1971525   0.6786371 -0.61888966   0.5687768
2   2    2       1     3  20 -1.1971525   0.6786371 -0.61888966   0.5687768
3   3    3       1     8  20  0.3438237   0.5649660 -0.01823904   0.5296354
4   4    4       1    14  20  1.2335182   0.5851602  1.17575120   0.5200049
5   5    5       1     5  20 -0.6874662   0.6227283 -0.22287505   0.5418169
6   6    6       1     1  20 -1.4276290   0.7120677 -1.32802377   0.6251602
head(aligned_DDA$person)
  pid case pweight score max   EAP.Dim1 SD.EAP.Dim1    EAP.Dim2 SD.EAP.Dim2
1   1    1       1     3  20 -1.1971502   0.6786302 -0.67347797   0.6177026
2   2    2       1     3  20 -1.1971502   0.6786302 -0.67347797   0.6177026
3   3    3       1     8  20  0.3438138   0.5649624 -0.02110962   0.5752058
4   4    4       1    14  20  1.2335267   0.5851569  1.27556287   0.5647479
5   5    5       1     5  20 -0.6874625   0.6227231 -0.24339467   0.5884401
6   6    6       1     1  20 -1.4276486   0.7120637 -1.44357454   0.6789306
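As a sanity check on the person estimates, the printed dimension 2 EAPs can be compared with \(\tilde{\theta}_{2j} = \hat{r}_2\theta_{2j} + \hat{s}_2\) using the DDA coefficients shown earlier. The numbers below are copied from the printed output; the object names are hypothetical, and exact equality is not expected because the values are rounded and the aligned EAPs appear to be recomputed from the transformed model.

```r
# DDA transformation constants for dimension 2 (from aligned_DDA$rhat / $shat)
r2 <- 1.086073
s2 <- -0.00131837

# EAP.Dim2 and SD.EAP.Dim2 for the first six persons, before and after DDA
eap_before <- c(-0.61888966, -0.61888966, -0.01823904, 1.17575120, -0.22287505, -1.32802377)
eap_after  <- c(-0.67347797, -0.67347797, -0.02110962, 1.27556287, -0.24339467, -1.44357454)
sd_before  <- c(0.5687768, 0.5687768, 0.5296354, 0.5200049, 0.5418169, 0.6251602)
sd_after   <- c(0.6177026, 0.6177026, 0.5752058, 0.5647479, 0.5884401, 0.6789306)

# Locations transform as r * theta + s; standard deviations scale by r only
eap_diff <- max(abs((r2 * eap_before + s2) - eap_after))
sd_diff  <- max(abs(r2 * sd_before - sd_after))

print(c(eap_diff = eap_diff, sd_diff = sd_diff))
```

Both discrepancies are small relative to the estimates themselves, which is consistent with the linear transformation of \(\theta_{dj}\). The dimension 1 columns are essentially unchanged because \(\hat{r}_1 = 1\) and \(\hat{s}_1 = 0\).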

References

Feuerstahler, L. M., & Wilson, M. (2019). Scale alignment in between-item multidimensional Rasch models. Journal of Educational Measurement, 56(2), 280-301.

Feuerstahler, L. M., & Wilson, M. (2021). Scale alignment in the between-items multidimensional partial credit model. Applied Psychological Measurement, 45(4), 268-282.